A Practical Algorithm for Intersecting Weighted Context-free Grammars with Finite-State Automata
Abstract
It is well known that context-free parsing can be seen as the intersection of a context-free language with a regular language (or, equivalently, the intersection of a context-free grammar with a finite-state automaton). The present article provides a practical and efficient way to compute this intersection by converting the grammar into a special finite-state automaton (the GLR(0)-automaton), which is subsequently intersected with the given finite-state automaton. As a byproduct, we present a generalisation of Tomita’s algorithm to recognize several inputs simultaneously.
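To make the notion of intersecting a grammar with an automaton concrete, here is a minimal Python sketch. It is not the GLR(0)-based method of the article; it is the classical fixpoint computation that underlies the textbook Bar-Hillel construction, and it ignores weights. It computes all triples (p, A, q) such that nonterminal A derives some string that takes the automaton from state p to state q. The function names, the rule encoding as (lhs, rhs) pairs, and the transition encoding as (state, symbol, state) triples are all assumptions made for this illustration.

```python
def derivable_triples(rules, transitions, states):
    """Return {(p, A, q)}: A derives a string leading the automaton from p to q.

    rules:       list of (lhs, rhs) with rhs a tuple of symbols (hypothetical encoding)
    transitions: iterable of (p, terminal, q) automaton edges
    states:      iterable of automaton states
    """
    nonterminals = {lhs for lhs, _ in rules}
    # rel[A] collects the state pairs (p, q) bridged by nonterminal A.
    rel = {A: set() for A in nonterminals}
    # term[a] holds the state pairs connected by a single edge labelled a.
    term = {}
    for p, a, q in transitions:
        term.setdefault(a, set()).add((p, q))

    changed = True
    while changed:  # iterate until no rule contributes a new pair
        changed = False
        for lhs, rhs in rules:
            # Start from the identity relation (empty prefix of the rhs).
            pairs = {(s, s) for s in states}
            for sym in rhs:
                step = rel[sym] if sym in nonterminals else term.get(sym, set())
                # Relational composition: extend each path by one rhs symbol.
                pairs = {(p, r) for (p, q) in pairs for (q2, r) in step if q == q2}
            if not pairs <= rel[lhs]:
                rel[lhs] |= pairs
                changed = True
    return {(p, A, q) for A, ps in rel.items() for (p, q) in ps}


def intersection_is_nonempty(rules, start, transitions, states, initial, finals):
    """True if some string is generated by the grammar and accepted by the automaton."""
    triples = derivable_triples(rules, transitions, states)
    return any((initial, start, f) in triples for f in finals)


# Toy grammar for a^n b^n (n >= 1): S -> a S b | a b
RULES = [("S", ("a", "S", "b")), ("S", ("a", "b"))]

# Linear automaton accepting exactly the string "aabb".
AABB = [(0, "a", 1), (1, "a", 2), (2, "b", 3), (3, "b", 4)]
# Linear automaton accepting exactly the string "aab".
AAB = [(0, "a", 1), (1, "a", 2), (2, "b", 3)]

in_language = intersection_is_nonempty(RULES, "S", AABB, range(5), 0, {4})
not_in_language = intersection_is_nonempty(RULES, "S", AAB, range(4), 0, {3})
```

When the automaton is a linear chain spelling out one input string, as in the two toy automata above, this non-emptiness check reduces to ordinary recognition of that string, which is the sense in which parsing is an intersection.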
Similar resources
Intersecting Hierarchical and Phrase-Based Models of Translation: Formal Aspects and Algorithms
We address the problem of constructing hybrid translation systems by intersecting a Hiero-style hierarchical system with a phrase-based system and present formal techniques for doing so. We model the phrase-based component by introducing a variant of weighted finite-state automata, called σ-automata, provide a self-contained description of a general algorithm for intersecting weighted synchrono...
Context-Free Recognition with Weighted Automata
We introduce the definition of language recognition with weighted automata, a generalization of the classical definition of recognition with unweighted acceptors. We show that, with our definition of recognition, weighted automata can be used to recognize a class of languages that strictly includes regular languages. The class of languages accepted depends on the weight set which has the algebra...
Chapter 9: Regular Approximation of Context-Free Grammars through Transformation
We present an algorithm for approximating context-free languages with regular languages. The algorithm is based on a simple transformation that applies to any context-free grammar and guarantees that the result can be compiled into a finite automaton. The resulting grammar contains at most one new nonterminal for any nonterminal symbol of the input grammar. The result thus remains readable and ...
Finding the Most Probable String and the Consensus String: an Algorithmic Study
The problem of finding the most probable string for a distribution generated by a weighted finite automaton or a probabilistic grammar is related to a number of important questions: computing the distance between two distributions or finding the best translation (the most probable one) given a probabilistic finite state transducer. The problem is undecidable with general weights and is NP-hard ...
Parsing with Pictures
The development of elegant and practical algorithms for parsing context-free languages is one of the major accomplishments of 20th century Computer Science. These algorithms are presented in the literature using string rewriting systems or abstract machines like pushdown automata, but the resulting descriptions are unsatisfactory for several reasons. First, even a basic understanding of parsing a...
Publication date: 2011